Speech estimation in non-stationary noise environments using timing structures between mouth movements and sound signals

نویسندگان

  • Hiroaki Kawashima
  • Yu Horii
  • Takashi Matsuyama
چکیده

A variety of methods for audio-visual integration, which integrate audio and visual information at the level of either features, states, or classifier outputs, have been proposed for the purpose of robust speech recognition. However, these methods do not always fully utilize auditory information when the signal-to-noise ratio becomes low. In this paper, we propose a novel approach to estimate speech signal in noise environments. The key idea behind this approach is to exploit clean speech candidates generated by using timing structures between mouth movements and sound signals. We first extract a pair of feature sequences of media signals and segment each sequence into temporal intervals. Then, we construct a cross-media timing-structure model of human speech by learning the temporal relations of overlapping intervals. Based on the learned model, we generate clean speech candidates from the observed mouth movements.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

A New Method for Speech Enhancement Based on Incoherent Model Learning in Wavelet Transform Domain

Quality of speech signal significantly reduces in the presence of environmental noise signals and leads to the imperfect performance of hearing aid devices, automatic speech recognition systems, and mobile phones. In this paper, the single channel speech enhancement of the corrupted signals by the additive noise signals is considered. A dictionary-based algorithm is proposed to train the speech...

متن کامل

سایکوآکوستیک و درک گفتار در افراد مبتلا به نوروپاتی شنوایی و افراد طبیعی

Background: The main result of hearing impairment is reduction of speech perception. Patient with auditory neuropathy can hear but they can not understand. Their difficulties have been traced to timing related deficits, revealing the importance of the neural encoding of timing cues for understanding speech. Objective: In the present study psychoacoustic perception (minimal noticeable differen...

متن کامل

Signal processing techniques for seat belt microphone arrays

Microphones integrated on a seat belt are an interesting alternative to conventional sensor positions used for handsfree telephony or speech dialog systems in automobile environments. In the setup presented in this contribution, the seat belt consists of three microphones which usually lay around the shoulder and chest of a sitting passenger. The main benefit of belt microphones is the small di...

متن کامل

The effect of redesign workstation on Speech Interference Level (SIL) among bank tellers

Abstract Background: There is always an interaction between man and his environment that can be the cause of physical, physiological and psychological stress on people and also cause discomfort, annoyance, and have direct and indirect effects on their performance and productivity, health and safety. People in their workplace are exposed to many factors related to work activities and environmen...

متن کامل

An Integrated Real-Time Beamforming and Postfiltering System for Nonstationary Noise Environments

In this paper, we present a novel approach for real-time multichannel speech enhancement in environments of non-stationary noise and time-varying acoustical transfer functions (ATFs). The proposed system integrates adaptive beamforming, ATF identification, soft signal detection, and multichannel postfiltering. The noise canceller branch of the beamformer and the ATF identification are adaptivel...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2010